This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

Add common controller based on presumed tfcli interface #14

Merged: 18 commits, Aug 20, 2021

Member

@turkenh turkenh commented Aug 16, 2021

This PR adds a common controller implementation based on a presumed terraform cli interface.

Generated controllers will call terraform.SetupController to set up the controllers. For example:

// Setup adds a controller that reconciles VPC managed resources.
func Setup(mgr ctrl.Manager, l logging.Logger) error {
	return terraform.SetupController(mgr, l, &v1alpha1.Vpc{}, v1alpha1.VpcGroupVersionKind, getProviderConfigFn)
}

We will also need to generate the required functions on managed resource structs to satisfy the Terraformed interface.
For example:

package v1alpha1

import (
	"github.com/crossplane-contrib/terrajet/pkg/conversion"
)

func (mg *Vpc) GetTerraformResourceType() string {
	return "aws_vpc"
}

func (mg *Vpc) GetTerraformResourceIdField() string {
	return "id"
}

// GetObservation of this VPC
func (mg *Vpc) GetObservation() ([]byte, error) {
	return conversion.TFParser.Marshal(mg.Status.AtProvider)
}

// SetObservation for this VPC
func (mg *Vpc) SetObservation(data []byte) error {
	return conversion.TFParser.Unmarshal(data, &mg.Status.AtProvider)
}

// GetParameters of this VPC
func (mg *Vpc) GetParameters() ([]byte, error) {
	return conversion.TFParser.Marshal(mg.Spec.ForProvider)
}

// SetParameters for this VPC
func (mg *Vpc) SetParameters(data []byte) error {
	return conversion.TFParser.Unmarshal(data, &mg.Spec.ForProvider)
}
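
As a rough illustration of the contract those generated methods satisfy, here is a minimal, self-contained sketch. The `Terraformed` interface below is reconstructed only from the methods listed above, so the definition in this PR may differ. The flattened `Vpc` struct, its field names, and the use of `encoding/json` with `json` tags (instead of `conversion.TFParser` with `tf` tags) are simplifying assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Terraformed is a hypothetical reconstruction of the interface, inferred
// only from the generated methods shown in the PR description.
type Terraformed interface {
	GetTerraformResourceType() string
	GetTerraformResourceIdField() string
	GetObservation() ([]byte, error)
	SetObservation(data []byte) error
	GetParameters() ([]byte, error)
	SetParameters(data []byte) error
}

// VpcParameters and VpcObservation are simplified stand-ins for the
// generated Spec.ForProvider and Status.AtProvider types.
type VpcParameters struct {
	CidrBlock string `json:"cidr_block"`
}

type VpcObservation struct {
	ID string `json:"id"`
}

// Vpc is a flattened stand-in for the generated managed resource.
type Vpc struct {
	ForProvider VpcParameters
	AtProvider  VpcObservation
}

func (mg *Vpc) GetTerraformResourceType() string    { return "aws_vpc" }
func (mg *Vpc) GetTerraformResourceIdField() string { return "id" }

func (mg *Vpc) GetObservation() ([]byte, error)  { return json.Marshal(mg.AtProvider) }
func (mg *Vpc) SetObservation(data []byte) error { return json.Unmarshal(data, &mg.AtProvider) }
func (mg *Vpc) GetParameters() ([]byte, error)   { return json.Marshal(mg.ForProvider) }
func (mg *Vpc) SetParameters(data []byte) error  { return json.Unmarshal(data, &mg.ForProvider) }

// Compile-time check that *Vpc satisfies the sketched interface.
var _ Terraformed = &Vpc{}

func main() {
	vpc := &Vpc{ForProvider: VpcParameters{CidrBlock: "10.0.0.0/16"}}
	if err := vpc.SetObservation([]byte(`{"id":"vpc-123"}`)); err != nil {
		panic(err)
	}
	params, err := vpc.GetParameters()
	if err != nil {
		panic(err)
	}
	fmt.Println(vpc.GetTerraformResourceType(), string(params), vpc.AtProvider.ID)
}
```

The point of the byte-slice round trip is that a common controller can move data between the Kubernetes object and Terraform JSON without knowing the concrete resource type.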

@turkenh turkenh requested a review from ulucinar August 16, 2021 22:46
@turkenh turkenh force-pushed the common-controller branch 3 times, most recently from c4904b2 to 887f311 Compare August 17, 2021 21:14
@turkenh turkenh marked this pull request as ready for review August 17, 2021 21:15
@turkenh turkenh requested a review from muvaf August 17, 2021 21:15
Member

@muvaf muvaf left a comment

I have reviewed the parts I could but in some cases I'm not sure what the alternatives are since the underlying CLI implementation is not there yet.


// Observe is a Terraform Cli implementation for Observe function of Adapter interface.
func (t *Cli) Observe(tr resource.Terraformed) (ObserveResult, error) {
b, err := t.getClientBuilderForResource(tr)
Member

I see that we prepare the TF workspace in the filesystem and then create a client specific for each operation by calling Build{Observe,Create,Update}Client. Would it make sense to prepare the workspace in Connect method of ExternalConnector implementation? I can't see the content of Build{Observe,Create,Update}Client calls but I have a hunch that once we prepare the workspace (with terraform.tfstate and main.hcl.json) they all can use the same folder to run the commands.

Member Author

@turkenh turkenh Aug 18, 2021

I believe this is already the case. With the help of a handler (for which we use the CR UID), we always use the same workspace. However, with the current structure, this implementation detail is hidden from the upper-layer controller, so I would prefer not to explicitly prepare the workspace in the Connect method.

Member Author

I got your point after our offline discussion and updated it accordingly.
We now build the client once in the Connect method, which is enabled by the reworked interface that we arrived at together.

import jsoniter "github.com/json-iterator/go"

// TFParser is a json parser to marshal/unmarshal using "tf" tag.
var TFParser = jsoniter.Config{TagKey: "tf"}.Froze()
Member

We can borrow some properties from the fastest configuration.

Member Author

Are we confident that configuration wouldn't cause any data loss for any generated resource?

Member

Floats lose precision beyond 6 digits, which I believe is fine, but I don't have a strong opinion here.

return []string{"Unknown", "Observe", "Create", "Update", "Delete"}[o]
}

type OperationInProgressError struct {
Member

I think we can return informational fields in cases where the operation is in progress rather than an error struct because it's a valid state to be in progress rather than an erroneous one.

Member Author

Actually, we discussed this option with @ulucinar and finally agreed that it makes more sense to do it this way, since trying to kick off an operation while another one is in progress is an error case from tfcli's point of view. Handling this in the upper layer (the CLI adapter) sounds more reasonable. In other words, it is a valid state for the controller/reconciler but not a valid one for tfcli.

Collaborator

Yes, I agree. We would not like to allow concurrent Terraform operations (technically commands) in the same Terraform workspace. From tfcli's PoV it's an error to request a new operation before an ongoing one completes. This is also in sync with Terraform CLI's behaviour: if an ongoing Terraform command holds the state lock and we try to run a new command, the latter command cannot acquire the lock and fails. Thus, the proposal was to return an error to the caller, who wants to initiate a new operation via tfcli, if there is an already ongoing operation. And similar to k8s.io/apimachinery/pkg/api/errors package, provide utilities to deduce error types so that the caller can take appropriate actions.

Member

Sounds good!

Member

@negz negz left a comment

Just a few mostly nitpicks. :)


// ObserveResult represents result of an observe operation
type ObserveResult struct {
Completed bool
Member

Would it make any sense for these Completed fields to be a channel rather than a boolean? i.e. One that blocked until the action was complete, similar to a context's Done channel?

Member Author

Not sure. We call Observe in each reconcile and do not carry any information/state over to the next reconcile loop, so I couldn't figure out how to leverage such a channel in this context.

Collaborator

The bool used here is not meant to block its observer as it is currently part of an async interface. However, if we decide to make tfcli.Observe blocking, then we can remove it.
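
One way to relate the two ideas in this thread: a completion channel can live inside the client, while the reconciler only ever sees a non-blocking boolean derived from it. The sketch below is a hypothetical illustration, not the tfcli design. The `operation` type and its `start` and `finish` helpers are invented names, and only `ObserveResult.Completed` comes from the snippet under review.

```go
package main

import "fmt"

// ObserveResult mirrors the struct under review, reduced to the one field
// that is visible in the diff.
type ObserveResult struct {
	Completed bool
}

// operation is a hypothetical in-flight Terraform operation. Its done
// channel is closed when the operation finishes.
type operation struct {
	done chan struct{}
}

func start() *operation { return &operation{done: make(chan struct{})} }

// finish marks the operation as complete; it must be called at most once.
func (o *operation) finish() { close(o.done) }

// Observe polls the operation without blocking, which suits a reconciler
// that is re-invoked from scratch on every loop.
func (o *operation) Observe() ObserveResult {
	select {
	case <-o.done:
		return ObserveResult{Completed: true}
	default:
		return ObserveResult{Completed: false}
	}
}

func main() {
	op := start()
	fmt.Println(op.Observe().Completed) // polled while still running
	op.finish()
	fmt.Println(op.Observe().Completed) // polled after completion
}
```

Keeping the channel on the client side means no state has to travel between reconcile loops; each loop simply polls again.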

turkenh added 14 commits August 19, 2021 15:55
Signed-off-by: Hasan Turken <turkenh@gmail.com>
@turkenh turkenh force-pushed the common-controller branch from 4eebc77 to 64ec148 Compare August 19, 2021 13:00
Signed-off-by: Hasan Turken <turkenh@gmail.com>
Signed-off-by: Hasan Turken <turkenh@gmail.com>
@turkenh turkenh force-pushed the common-controller branch from 5514a31 to 514f95d Compare August 19, 2021 19:58
Member Author

turkenh commented Aug 19, 2021

@muvaf @ulucinar I've updated the PR based on our latest discussions today.

It was a bit tricky since it is not possible to test without the actual tfcli implementation, but I believe I covered most of the cases we can think of. The most challenging part was properly persisting annotations (state + external-name), spec (late-init), and status without losing any data; I would prefer to get back to that in a later PR when things are testable.

That said, I would suggest merging this one if you don't see anything blocking, so we can iterate more easily without dealing with open PRs, rebases and conflicts.

tfcli, err := conversion.BuildClientForResource(tfcb, tr)
*/

tfcli, err := conversion.BuildClientForResource(nil, tr)
Collaborator

I saw the discussion that assumes the TF workspace is initialized when we build the client. It's not the case atm because tfcli.Builder.Build is a synchronous call and workspace building is potentially a long running task.

This has not been an issue so far because the tfcli.Client interface is fully async (Observe, Create, Destroy, etc.), excluding some getters (GetState, GetHandle, etc.). During each call, tfcli.Client initializes a workspace if needed.

Collaborator

We may operate under the assumption that workspace initialization is not costly (because we have the means to do that by avoiding provider plugin downloads each time a workspace is initialized) but I do not like the idea of operating under extra assumptions. For example, development would require special setup & configuration on a developer's machine.

Member

> We may operate under the assumption that workspace initialization is not costly (because we have the means to do that by avoiding provider plugin downloads each time a workspace is initialized) but I do not like the idea of operating under extra assumptions.

As we discussed offline, it seems that how we build the workspace also affects whether Observe should be blocking. I think it is safe to assume that workspace initialization will not include a plugin download, because we will ship the binary in the container image. For the development experience, I have a hunch that we can come up with ways to make it closer to production. For example, we can put the binary in a directory that is also usable on local machines, like /usr/bin, and require it to be downloaded before running main.go in both the container and the development machine. We could have a local dev script like this one to download it for them.

IMO, the extra cost in the code of making workspace building async is not worth the development experience we'd get by assuming the developer might not have the plugin and would want terrajet to download it automatically.

Member

@muvaf muvaf left a comment

LGTM! No big concerns apart from the case where the user gives an external name and the resource does not exist.

// An Adapter is used to interact with terraform managed resources
type Adapter interface {
Observe(ctx context.Context, tr resource.Terraformed) (Observation, error)
Update(ctx context.Context, tr resource.Terraformed) (Update, error)
Member

Suggested change
- Update(ctx context.Context, tr resource.Terraformed) (Update, error)
+ Apply(ctx context.Context, tr resource.Terraformed) (Apply, error)

Or maybe CreateOrUpdate or Upsert, something to make it clear that it can create as well.

Comment on lines +163 to +182
if xpmeta.GetExternalName(tr) == "" {
	// Terraform stores id for the external resource as an attribute in the
	// resource state. Key for the attribute holding external identifier is
	// resource specific. We rely on GetTerraformResourceIdField() function
	// to find out that key.
	stAttr := map[string]interface{}{}
	if err = JSParser.Unmarshal(st.GetAttributes(), &stAttr); err != nil {
		return nil, errors.Wrap(err, "cannot parse state attributes")
	}

	// err is nil in the branches below, and wrapping a nil error yields nil,
	// so new errors are constructed instead of wrapping.
	id, exists := stAttr[tr.GetTerraformResourceIdField()]
	if !exists {
		return nil, errors.Errorf("no value for id field: %s", tr.GetTerraformResourceIdField())
	}
	extID, ok := id.(string)
	if !ok {
		return nil, errors.New("id field is not a string")
	}
	xpmeta.SetExternalName(tr, extID)
}
Member

Should this be an Initializer, alternative to NameAsExternalName?

Member Author

I am not sure. If we make it an Initializer, then setting the external name will happen in the next reconcile, parsing the stored state annotation (again). I don't see an immediate problem with this, but I also don't see much value; I would prefer to reconsider it later.
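
For reference, the external-name lookup discussed in this thread can be isolated into a small standalone function. The sketch below uses the standard library's `encoding/json` in place of `JSParser`, and the `externalIDFromAttributes` name is hypothetical. It builds fresh errors in the "missing" and "not a string" branches, since there is no underlying error to wrap at those points.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// externalIDFromAttributes extracts the external identifier from raw
// Terraform state attributes. idField is the resource-specific key, for
// example the value returned by GetTerraformResourceIdField().
func externalIDFromAttributes(attrs []byte, idField string) (string, error) {
	stAttr := map[string]interface{}{}
	if err := json.Unmarshal(attrs, &stAttr); err != nil {
		return "", fmt.Errorf("cannot parse state attributes: %w", err)
	}
	id, exists := stAttr[idField]
	if !exists {
		return "", fmt.Errorf("no value for id field: %s", idField)
	}
	extID, ok := id.(string)
	if !ok {
		return "", errors.New("id field is not a string")
	}
	return extID, nil
}

func main() {
	id, err := externalIDFromAttributes([]byte(`{"id":"vpc-123","cidr_block":"10.0.0.0/16"}`), "id")
	fmt.Println(id, err)

	_, err = externalIDFromAttributes([]byte(`{"arn":"example"}`), "id")
	fmt.Println(err)
}
```

A function of this shape could be called either inline during observation or from an Initializer, which is the trade-off being weighed here.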

// resource specific. We rely on GetTerraformResourceIdField() function
// to find out that key.
stAttr := map[string]interface{}{}
if err = JSParser.Unmarshal(st.GetAttributes(), &stAttr); err != nil {
Member

There is a Get function in json-iterator that could help here.

Signed-off-by: Hasan Turken <turkenh@gmail.com>
Signed-off-by: Hasan Turken <turkenh@gmail.com>
@turkenh turkenh merged commit a86c258 into crossplane:main Aug 20, 2021
@turkenh turkenh deleted the common-controller branch August 20, 2021 17:50
4 participants